Relevance Judgments for Assessing Recall

Authors

  • Peter Wallis
  • James A. Thom
Abstract

Recall and precision have become the principal measures of the effectiveness of information retrieval systems. Inherent in these measures of performance is the idea of a relevant document. Although recall and precision are easily and unambiguously defined, selecting the documents relevant to a query has long been recognized as problematic. To compare the performance of different systems, standard collections of documents, queries, and relevance judgments have been used. Unfortunately, the standard collections, such as SMART and TREC, have locked in a particular approach to relevance that is suitable for assessing precision but not recall. The problem is demonstrated by comparing two information retrieval methods over several queries, and showing how a new method of forming relevance judgments that is suitable for assessing recall gives different results. Recall is an interesting and practical issue, but current test procedures are inadequate for measuring it. Copyright © 1996 Elsevier Science Ltd
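
As a rough illustration of why the choice of relevance judgments matters more for recall than for precision, the standard set-based definitions can be sketched in Python. This is only a minimal sketch and not the authors' procedure: the function name `precision_recall` and all document identifiers are hypothetical, and the two judgment sets are invented to show the effect of an incomplete list of relevant documents.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical system output for one query.
retrieved = ["d1", "d2", "d3", "d4"]

# Hypothetical judgments that list only some of the relevant documents.
partial_judgments = {"d1", "d3", "d7"}

# Hypothetical fuller judgments: d8 and d9 are also relevant,
# but the system never retrieved them.
fuller_judgments = {"d1", "d3", "d7", "d8", "d9"}

p, r = precision_recall(retrieved, partial_judgments)
p_full, r_full = precision_recall(retrieved, fuller_judgments)

print(f"partial judgments: precision={p:.2f} recall={r:.2f}")
print(f"fuller judgments:  precision={p_full:.2f} recall={r_full:.2f}")
```

In this invented case the relevant documents missing from the partial judgments were never retrieved, so precision is the same under both judgment sets while recall drops once they are counted.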

Similar resources

Philosophy of IR Evaluation

• System evaluation: how good are document rankings? • User-based evaluation: how satisfied is user? NIST Why do system evaluation? • Allows sufficient control of variables to increase power of comparative experiments – laboratory tests less expensive – laboratory tests more diagnostic – laboratory tests necessarily an abstraction • It works! – numerous examples of techniques developed in the l...

Assessing thesaurus-based query expansion using the UMLS Metathesaurus

OBJECTIVES: Assess query expansion using thesaurus relationships and definitions in the UMLS Metathesaurus for improving searching performance. METHODS: The queries from a MEDLINE test collection (OHSUMED) were expanded using synonym, hierarchical, and related term information as well as term definitions from the UMLS Metathesaurus. Documents were retrieved from a word-statistical retrieval sys...

Ranking Retrieval Systems with Partial Relevance Judgements

Some measures such as mean average precision and recall level precision are considered as good system-oriented measures, because they concern both precision and recall that are two important aspects for effectiveness evaluation of information retrieval systems. However, such good system-oriented measures suffer from some shortcomings when partial relevance judgments are used. In this paper, we ...

Median measure: an approach to IR systems evaluation

In this paper we report results from three studies examining 1295 relevance judgments by 36 IR system end-users. We examined both the region of the relevance judgment, from non-relevant to highly relevant, and motivations or levels of their relevance judgments. Our study has three major findings. First, the frequency distributions of relevance judgments by IR system end-users tend to take on a ...

Northeastern University Runs at the TREC13 Crowdsourcing Track

The goal of the TREC 2012 Crowdsourcing Track was to evaluate approaches to crowdsourcing high quality relevance judgments for images and text documents. This paper describes our submission to the Text Relevance Assessing Task. We explored three different approaches for obtaining relevance judgments. Our first two approaches are based on collecting a limited number of preference judgments from ...

Journal:
  • Inf. Process. Manage.

Volume 32

Publication date: 1996